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ABSTRACT 

The large scale infall of galaxies around massive clusters provides a potentially powerful diag- 
nostic of structure growth, dark energy, and cosmological deviations from General Relativity. 
We develop and test a method to recover galaxy infall kinematics (GIK) from measurements 
of the redshift-space cluster-galaxy cross-correlation function i s cg (r p , r w ). Using galaxy and 
halo samples from the Millennium simulation, we calibrate an analytic model of the galaxy 
kinematic profiles comprised of a virialized component with an isotropic Gaussian velocity 
distribution and an infall component described by a skewed 2D t-distribution with a character- 
istic infall velocity v ryC and separate radial and tangential dispersions. We show that convolv- 
ing the real-space cross-correlation function with this velocity distribution accurately predicts 
the redshift-space , and we show that measurements of can be inverted to recover the 
four distinct elements of the GIK profiles. These in turn provide diagnostics of cluster mass 
profiles, and we expect the characteristic infall velocity v rjC (r) in particular to be insensitive 
to galaxy formation physics that can affect velocity dispersions within halos. As a proof of 
concept we measure £* g for rich galaxy groups in the Sloan Digital Sky Survey and recover 
GIK profiles for groups in two bins of central galaxy stellar mass. The higher mass bin has a 
tJ fjC (r) curve very similar to that of 10 14 h~ 1 M Q halos in the Millennium simulation, and the 
recovered kinematics follow the expected trends with mass. GIK modeling of cluster-galaxy 
cross-correlations can be a valuable complement to stacked weak lensing analyses, allowing 
novel tests of modified gravity theories that seek to explain cosmic acceleration. 

Key words: galaxy: clusters: general — galaxies: kinematics and dynamics — cosmology: 
large-scale structure of Universe 



1 INTRODUCTION 

As the largest bound systems in the universe, galaxy clusters carry 
imprints of cosmic growth via the distributio n and motion of 
their surrounding dark ma tter and galaxies fsee lAllen et aljEoill ; 
iKravtsov & BorganSll2012L and references within). They can there- 
fore play a powerful role in testing theories for the origin of cos- 
mic acceleration, complementing geometrical probes such as su- 
pernovae and baryon acoustic oscillations. The key uncertainty 
in this approach is calibration of the relation between cluster ob- 
servables (e.g., X-ray luminosity or temperature, optical galaxy 
richness, Sunyaev-Zel'dovich decrement) and halo mass. Stacked 
weak lensing has emerged as a robust approach to this problem 
because it is unaffected by the b aryonic physic s of t he intraclus- 
ter gas dMandelbaum et al I 2006; ISheldon et all 2009; lOguri et alJ 
2012, see [Weinberg et al.l 2012 for a review of this approach in 
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the br oader context of d a rk energy studies). Galaxy infall pat- 
terns dGunn & Gotl 1 19721 : iRvden & Gunnl [l987l : iRegos & Gelled 
Il989h offer an alternative probe of cluster mass profiles, which may 
prove a valuable complement to weak lensing if it can be imple- 
mented in a way that is insensitive to uncertainties of galaxy forma- 
tion physics. The redshift-space cluster-galaxy cross-correlation 
function, is a comprehensive characterization of the statistical 
relation between clusters and galaxies, influenced by both the real- 
space cross-correlation and the peculiar velocities induced by the 
cluster gravitational potential. This paper is the first of a three-part 
series which will describe the modeling of an d investigate its 
diagnostic power for cluster mass calibration and constraining cos- 
mology. 

In cluster-centric coordinates, the average galaxy kinemat- 
ics are the result of competition between the cluster potential 
and Hubble expansion. At small distances, virial motion domi- 
nates, making the galaxy distribution elongated along the line-of- 
sight (LOS) direction (a.k.a. the "Fingers-of-God" effect; FOG). At 
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larger distances, the galaxy kinematics become dominated by ra- 
dial infall, with a typical turn-around radius of several 
where the characteristic infall velocity is equal to the Hubble 
flow. This coherent infall produces a squashing dis tortion in £f cg 
at large scales, often referr ed to as the Kaiser effect (iKaised 1987; 
see also ISargent & Turner 1977). These two redshift space distor- 
tion (RSD) effects (see iHamiltonl 1998 for a pedagogical review) 
are also s een in the redshi ft-space galaxy auto-correlation func- 
tions (e.g., Rei d et al J20ll and references within), but since galax- 
ies feel a much stronger central potential near clusters than in the 
field, both small scale dispersion and large scale infall are strongly 
enhanced in the £* g case. 

The strong RSD in £* g allows reconstruction of the aver- 
age galaxy kinematics around clusters, which are in turn deter- 
mined by the average cluster mass profiles. For clusters of fixed 
mass M, €cg( r Pi r n) at projected separation r p and LOS sep- 
aration 7v can be derived by convolving the real-space cross- 
correlation fu nction tj r rn with the LOS velo city distribution function 
f(v los \r p ,y) (Peebles 1980l: [FisheJfT995h . 



& g (r P ,r«) + 1 = [e cg (yjrl+y*) + l] * f(v los \r p ,y), (1) 

where y is the LOS separation in real space. The real-space 
£cg has a roughly power-law form, so the strongest features in 
£c 9 arise largely from f(v\ os \r P ,y). In this paper, we first de- 
velop an analytic description of the average galaxy infall kinemat- 
ics (GI K) around clust er-mass halos found in the Millennium sim- 
ulation dSpringell2005h . We show that applying this model to Equa- 
tion Q] accurately reproduces the simulated ^ s cg , and we show that 
the analysis can be reversed to infer the correct GIK from mea- 
surements of £cg (r Pl tv). As an illustrati ve application to o bserved 
data, we measure f° r galaxy groups (lYang et al.1 20071) identi- 
fied in the Sloan Digital Sky Survey (SPSS: I York et al.l 2000) and 
apply our model to infer the infall kinematics. 

At small scales, the redshift-space distribution of £* g depends 
mainly on the velocity dispersion profile of the virialized cluster 
component. At large separations, one must consider the mean ra- 
dial velocity profile and the profiles of radial and tangential veloc- 
ity dispersions. (In fact, we will show that the velocity distributions 
are significantly non-Gaussian and exhibit internal correlations, all 
of which must be modeled to describe £* g accurately.) All four of 
these profiles can be reconstructed from our £* g modeling tech- 
niques, and all four of them provide diagnostics of cluster mass. 
We are particularly hopeful that the mean radial flow will prove 
to be a tool for inferring average mass profiles that is insensitive 
to galaxy formation physics. The cluster velocity dispersion pro- 
file can be affected at the 10 — 20% level by orbital anisotropy 
and potentially by other effects such as preferential destruction of 
galaxies that pass near the cluster center or incomplete relaxation of 
substructures. The velocity dispersions of galaxies within infalling 
halos may also be affected by biases arising from galaxy forma- 
tion physics. However, the mean velocity of galaxies in an infalling 
halo should, on average, match the mean velocity of the halo itself, 
as galaxies and dark matter are affected by the same gravitational 
potential. We will test this conjecture in future work. 

Around individual clusters, the galaxy distribution in the (r p , 
tv) plane shows a distinctive trumpet-shaped pattern, and the dis- 
tributi on in ry at fixed r v of ten shows a caustic-like disconti- 
nuity (Diaferio & Gelle d 19971) . N-body simulations suggest that 
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the caustic location provides a direct measure of the escape ve- 
locity p rofile, wh ich can i n turn be co nverted to a cluster mass 
profile toiaferiol 1999; s ee ISerra et al.l 2011 for the current state 
of the art). iRines et alj J2003I) have used this technique to in- 
fer cluster mass profiles extending beyond th e virial radius (als o 
see IRines & Diaferidl2006l : iGeller et al.ll2012l : IRines et al.ll2012h . 
However, measurements for any individual cluster are affected 
by galaxy shot noise and by departures from spherical symme- 
try dWhite et al.ll20lok and it is not clear whether averaging mea- 
surements from multiple clusters will yield an unbiased mean re- 
sult. Our initial motivation for this study was, in part, to general- 
ize the "caustic method" to the case where an overlapping clus- 
ter survey and galaxy redshift survey provide a large total number 
of cluster-galaxy pairs, even though the numbers around an indi- 
vidual cluster may be too small for caustic detection. Although 
we will draw connections between £* g and the trumpet-shaped 
patterns of the caustic method, our approach ultimately does not 
rely on finding discontinuities in the data or identifying them with 
the escape velocity profile. Instead it relies on the general predic- 
tions of velocity distributions around massive halos in cosmologi- 
cal N-body simulatio ns. Previous analyses of £* g include the study 
of lCroftetal.1 fl999h using APM galaxy clusters and the study 
of iLi et al using velocity dispersion profiles of groups in 

the SDSS. Both studies assume Gaussian LOS velocity distribu- 
tions with a mean radial infall profile. Here we adopt a more com- 
prehensive approach to model not only the first two moments of 
velocity distributions, but the full GIK from the inner 1 /i _1 Mpc to 
scales beyond 40 h~ 1 Mpc. 

One can imagine two somewhat different ways to go from 
GIK modeling of £* g to constraints on cosmological parameters. 
One is to calibrate the average masses of clusters in bins of clus- 
ter observables, then combine this calibration with cluster counts to 
extract cosmological constraints via the halo mass function, anal- 
ogous to approach of Rozoetal using stacked weak lens- 
ing. The second is to extract constraints directly from £* g itself, 
marginalizing over uncertainties in galaxy bias. Our primary goals 
in this paper are to understand the physical origin of the features 
in £cg and develop a method for inferring GIK statistics from £* g 
measurements. In the second paper of this series we will investi- 
gate GIK as a method for measuring mean cluster mass profiles, 
with particular attention to galaxy bias effects. In the third paper 
we will investigate the cosmological information that can be de- 
rived fr om £?. n , in compar i son to and com bination with cluster weak 
lensing teozo et al.ll2010l : IZu et al"]l2012h . 

In the context of standard dark energy models, £* g and stacked 
weak lensing analyses involve different types of systematic un- 
certainties. In the context of modified gravity theories of cos- 
mic acceleration, they also provide distinct information. Gravita- 
tional lensing and the motions of non-relativistic tracers are af- 
fected by different combinations of gravitational potentials, which 
are equal in GR but un equal in many modified gravity theo- 
ries (I Jain & KhourvlEolob . Furthermore, transitions between "un- 
shielded" and "shielded" regimes of modified gravity can produce 
disti nctive features in the d e nsity and velocit y fields around clus- 
ters jLombriser et al.l 120121 ; lLam et al.l 12012), which may reveal 
themselves as unusual features in We will investigate this sen- 
sitivity to alternative gravity theories in future work. 

We begin our study by characterizing the GIK mass halos in 
the Millennium simulation with a compact analytic description. In 
^3]we show that the GIK model, in combination with the real-space 

accurately reproduces the Millennium £* g . In 33] we show that 
the GIK parameters can be reconstructed from measurements of 
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Figure 1. Three representations of the average galaxy distribution around Millennium clusters with mass between 1.259 — 1.585 x 1O 14 M0 in the redshift- 
projected separation plane. Left panel: Number of stacked galaxies in r p -cAz cells. Middle panel: Cluster-galaxy correlation function in redshift space f £ . 
Right panel: normalized by the projected cluster-galaxy correlation function w p at each r p . The yellow solid U-shape curve delineates the characteristic 
scale of 7v at which ^ g drops by one e-fold at fixed r p , relative to its value at tv = 0. The contour levels for each panel are colour-coded by the top colour 
bar. 



^ cg , mm A n aJwe apply our methodology to measurements of £* g 
for SDSS groups. We summarize our results and discuss prospects 
for this approach in ^6] 



2 £c g AND GALAXY INFALL KINEMATICS AROUND 
MILLENNIUM SIMULATION HALOS 

To investigate the average galaxy velocity distribution as a func- 
tion of cluster-centric radius, we make use of the semi -analytic 
mode l (SAM) galaxies inside the Millennium simulation dSpringell 
120051) . which evolves 2160 3 dark matter particles with Mp — 
8.6 x 10 8 /i _1 M in a periodic box 500/i _1 Mpc on a sideQThe 
underlying cosmological model is inflationary cold dark matter 
with a cosmological constant (ACDM), though the values of the 
matter density Q m and power spectrum normalization as are some- 
what high compared to current estimates. The SAM galaxies are 
then constructed by assigning an empirical gal axy formation recipe 
along the merger trees of dark matter halos foe Lucia & Blaizotl 



along 
l2007l) 



Since the galaxy kinematics are primarily determined by 
the long term gravitational forces that are unaware of the detailed 
baryonic physics inside galaxies, we can mostly treat the kinemat- 
ics of galaxies and their host sub-halos as equal. The only excep- 
tion is within the cluster virial radius, where the prescriptions for 
dynamical friction may alter the kinematics of galaxies after their 
host sub-halos fail to survive abov e the resolution limit of sim- 
ulation because of tidal tru ncation jGhigna et alj|2000l : iGao et aD 
12004 1 Kravtsov et al. 2004). The particular dyna mical friction pre- 
scription adopted in |Pe Lucia & Blaizotl J2007l) uses a variant of 
the classic formula from lBinnev & Tremaind Jl987h to calculate 
the friction time scale. After this time scale, galaxies are assumed 
to merge onto the central galaxies of the main halo. The uncertain- 
ties in the dynamical friction time scale are significant, but they 



2 All the distances in the paper are in comoving units, relative to halo cen- 
ters. All the kinematics are relative to halo center-of-mass velocities. 



only affect the very inner region that our analysis is insensitive 
to. Therefore, we believe the galaxy kinematics measured from the 
SAM sample are representative of ACDM+General Relativity (GR) 
models. We will comment on the potential impacts of different dy- 
namical friction prescriptions on our GIK model in Paper II. 

To obtain a decent signal-to-noise ratio, we select SAM galax- 
ies with absolute SDSS r-band magnitude M r < —18 at z = 0.1, 
which produces a sample size of 6791721 galaxies. We also re- 
peated our kinematics measurements using a brighter sample with 
M r < — 20 and found negligible differences except for relatively 
larger noise. We nonetheless will investigate the dependence of 
GIK on galaxy type and luminosity in Paper II. We selected clus- 
ter^ in six mass bins with 0.1 dex in bin width, which corresponds 
to at most 25% difference in mass, and < 8% difference in overall 
velocity amplitude within each bin (assuming characteristic veloc- 
ities v oc M 1//3 ). The narrow bin width ensures that the scatter we 
measure in the galaxy velocities within each mass bin is intrinsic, 
instead of extrinsic scatter induced by the scatter in cluster masses 
within that bin. The mass bins are indicated by the top left panel 
of Fig. HI (discussed further below). The clusters are identified by 
spherical overdensity in the simulation, and the cluster mass M c is 
defined as the mass inside a sphere with enclosed density 200 times 
the mean matter density fi m . Throughout our analysis, we choose 
our fiducial mass bin to be M c = 1.259 - 1.585 x 10 14 /i _1 M©. 

Fig. Q] compares three different ways of illustrating the aver- 
age distribution of galaxies around clusters in redshift space, using 
our fiducial cluster mass bin as an example. The left panel sim- 
ply shows the stacked number counts in cells of equal area on the 
redshift-r p plane. As mentioned in the introduction, this resembles 
the traditional way in which the "caustic" curve is identified for in- 
dividual clusters — we can see an enhancement of the galaxy num- 
ber distribution at small redshift separations forming the trumpet- 

3 We use the terms clusters, groups, and dark matter halos nearly inter- 
changeably throughout the paper, though "groups" should be understood to 
refer to the low end of the cluster mass range. 
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Figure 2. Joint probability distributions of radial and tangential velocities P(v r , vt ) from the simulation (top panels) and the best-fit using our GIK model (bot- 
tom panels), in four different radial bins marked at the bottom of each panel. The colour scales used by panels in the same column are identical, indicated by 
the colour bar on top. 



shaped pattern. At small r p , the galaxy distribution is stretched 
along the LOS by virial motions, but at r p ~ 2 /i _1 Mpc it is highly 
compressed along the LOS by infall. While there are strong gradi- 
ents in TV at each r p , there is not an obvious line of discontinuity, 
perhaps because caustics of individual clusters are washed out by 
scatter in the stack. 

Note that although the cells in this representation have equal 
area, those at large r p have larger volumes in 3 -dimensional red- 
shift space because each cell represents a cylindrical ring with ra- 
dius r p (Vceii rp). The middle panel shows the central quantity 
in this paper, the redshift-space cluster-galaxy correlation func- 
tion ^ s cg . Since it is equivalent to the overdensity of galaxies around 
clusters, the dominant feature we see, other than the FOG and 
Kaiser effects, is the declining trend of overdensity with distance. 
To highlight the cluster RSD effects at fixed r p , in the right panel 
we plot the contours of S, which is defined as the ratio between 
£,cg( r Pi r ir) an d the projected correlation function w p (r p ) at given 
r p . Since w p (r p ) is the integral of £*g along tv at fixed r p |j the 
ratio S has no information that is not already in £* g , but by scaling 
out the radial trend it highlights the RSD effects. The highest peak 
for S is no longer at the origin (i.e., cluster center), but migrates 
horizontally to the region around r p ~ 2/i _1 Mpc, meaning that 
the relative distribution of galaxies along tv is the most compact 
at r p ~ 2/i -1 Mpc and becomes more spread at both small and 
large scales. The yellow solid curve in the right panel quantifies 
this compactness — it shows the characteristic LOS distance 7V, C 



4 w p( t p) = f-^ £,cg {^\J r p + y 2 ^) dy- in practice, we cut off the inte- 
gration at y = ±40 h~ 1 Mpc. 



at which the amplitude of S drops by 1/e from £(7v — 0) at each 
r p . The curve has a characteristic U- shape, indicating a more com- 
pact distribution of galaxies at intermediate projected radii. Note 
that because S oc £* g at each r p , this U-shaped curve would oc- 
cupy the same locus in the £* g plot of the middle panel, although 
we do not show it there. 

What determines the location of this characteristic U-shaped 
curve? It bears some resemblance to the caustic curve for individ- 
ual clusters, where the fall-off of the galaxy distribution on the 
redshift-r p diagram is believed to be the natural boundary imposed 
by the escape velocities at different radii. Indeed, in Fig. Q] the U- 
shaped curve in the right panel is similar in shape to isodensity 
contours in the left panel, with the normalization by the mean ex- 
pected galaxy number removing the somewhat arbitrary geometric 
weighting (oc r p ) of the straight number-count diagram. However, 
with no clear discontinuity in ^ s cg or S, we cannot relate this curve 
directly to escape velocities. To explain it quantitatively, we need to 
understand the average galaxy velocity distribution function around 



clusters, and in particular, f(v\ 



,y)- 



For each mass bin of clusters, we stacked all the galaxies in 
the cluster-centric coordinate to produce a synthetic cluster of that 
mass bin. Although individual clusters vary in shape and lumpi- 
ness, the synthetic clusters are isotropic and smooth by construc- 
tion. Therefore, in order to fully describe f(v\ 08 \r p ,y), we only 
need to measure the joint probability distribution function (PDF) 
of radial velocity v r and "half" the tangential velocity component 
v t , at each radius r, P(v r , v t \r), so that 



f(vios\r P ,y) 



/ + oo 
-oo 



P [Vr , V t 



V\o 



v r sin t 



cos 



dv r 

cos ' 



(2) 
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Figure 3. Probability distributions of tangential (top panels) and radial velocities (bottom panels) at the same four radial bins as in Fig. [2 In each panel, 
the gray solid curve shows the ID marginal distribution. For the left two panels, red dashed and blue dotted curves show the relative contribution from the 
virialized and infall components, respectively; for the right two top (bottom) panels, blue, green, and red curves show the conditional probability distributions 
of vt (v r ) at three different v r (vt), as labeled. 



where r = y/r% + y 2 and 9 = tan _1 ^/r p . Here v t repre- 
sents the tangential velocity (vt) component that is projected in 
the plane of LOS axis and galaxy position vector in the cluster- 
centric frame. Given an isotropic cluster, this projected component 
is v t = \vt \ cos ip where ip is randomly distributed between —90 
and 90 degrees, hence "half" of vt - To avoid redundancy, hereafter 
we refer to vt simply as the "tangential velocity". Note that we sub- 
tract Hubble flow when defining v r , and the probability distribution 
of v t is symmetric about zero. 

The top panels of Fig. [2 shows the measured P(v r ,vt) for 
four radial bins that represent four distinctive regimes of GIK 
around clusters. Starting from the innermost radial bin (r = 0.0- 
— 0.4/i _1 Mpc), the joint velocity distribution appears to be a single 
component ellipse with the major axis in the radial velocity direc- 
tion, indicating a preference for radially oriented orbits that may 
arise from galaxies on first infall or sec ond passage, whose velocity 
directions have not been randomized te ertschingerl 1 1 9 85h . Going 
slightly further out (r = 1.6 — 2 .0h~ 1 Mpc), the joint distribution is 
clearly resolved into two parts, one virialized Gaussian component 
with zero means of v r and vt, and one radial velocity-dominated 
component skewed toward negative radial velocities (infall). At 
r = 4.0 — 4.4/i _1 Mpc, near the turn-around radius where infall 
velocity is comparable to Hubble flow, the virialized component 
disappears while the infall component has a mean v r — 400 kms -1 
but no skewness. On very large scales (r = 20.0 — 20.4/i _1 Mpc), 
the joint distribution is still largely infall, but skewed toward posi- 
tive velocities. The joint velocity distribution measured for any of 
the other radial bins is qualitatively similar to one of the four bins 
shown here, or some interpolation between two of them. To ensure 
an accurate description of f(vi OB \r p , y) at (r p , tv) < 30 /i -1 Mpc, 



where the measurements of ^ s cg are most precise, we need to model 
P(v r ,vt) across all scales from the inner 1 h~ 1 Mpc to beyond 
40/i~ 1 Mpc. 

Motivated by the top panels of Fig. [2 we adopt a two- 
component mixture model for the velocity distribution at any given 
cluster-centric radius r, with the virialized component described 
by a 2D Gaussian Q and the infall component by a 2D skewed t- 
distribution T: 

P{v r ,Vt) = P(V) = /vir ■ S(v) + (1 - /vir) ■ T(v), (3) 

where / v i r ^ is the fraction of galaxies in the virialized com- 
ponent, approaching zero at large r. We refer to the radius beyond 
which /vir = as the "shock radius" r s h, since it marks (at least 
within the model) the boundary between single-component and 
two-component flow. By definition Q has zero mean in both ra- 
dial and tangential axes, and we find it adequate to assume equal 
dispersions, making Q a function of only one parameter, the virial 
dispersion cr v ir (which is still allowed to vary with r). For the in- 
fall component, describing the varying degrees of skewness and 
kurtosis at different r requires a functional form T with greater 
com plexity. We adopt the skewed t-distribution parameterization 
from lAzzalini & Capitaniol d2003h . with two parameters describing 
the higher order moments of the velocity distribution (a and dof ) 
in addition to three parameters for the mean and dispersions (v r , c , 
cr ra d, and atan). The full expression is 

T(v) = 2£ 2 (v;dof)x 

where v — (iv,c,0), ol — (a, 0), and u> — (cr ra d, cr ta n) are 2- 
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Figure 4. Best-fit GIK (galaxy infall kinematics) model parameters as functions of radius for six mass bins, (a): characteristic infall velocity; (b): fraction of 
the virialized component; (c): velocity dispersion of the virialized component; (d): radial velocity dispersion of the infall component; (e): tangential velocity 
dispersion of the infall component; (f): parameter for describing skewness in the radial velocity axis; (g): degrees of freedom for overall kurtosis. 



element vectors, and Q v = (v — v) T S 1 (v — v) is a scalar where 



2 

^rad 






2 

°tan 



(5) 



For the two rhs terms in Equation [4] £2 is the density function of 
2D £-variate with dof degrees of freedom, 



£ 2 (v;dof) 



r{(dof + 2)/2} 
|E| 1 /2( 7r dof) 1 /2r(dof/2) 



1 + 



Qv 
dof 



(dof +2)/2 



(6) 

and Ti (x; dof + 2) denotes the scalar £-distribution function with 
dof + 2 degrees of freedom. Generally speaking, a controls the 
skewness of P(v r ,vt) in the radial velocity direction, while dof 
adjusts the kurtosis in both directions, with lower dof correspond- 
ing to longer non-Gaussian tails. Since P(v r ,v t ) is symmetric 
in the tangential velocity axis, ol is reduced to one parameter a. 



cr ra d and otan describe the dispersion in each direction, and v r , c 
is the characteristic radial velocity. Therefore, we have seven pa- 
rameters in total for P(v r , v t ) at every r: virialized fraction / v ir, 
velocity dispersion of the virialized component cr v ir, characteristic 
infall velocity v r , c , two velocity dispersions of the infall compo- 
nent cr ra d and atan, skewness parameter a, and kurtosis parameter 
dof (effectively reducing to five parameters at r > r s h). With seven 
parameters, Equation [3] provides an excellent fit for the measured 
P(v r , v t ) at all scales, as shown visually in the bottom panels of 
Fig. |3 and in greater detail below. We considered other parameteri- 
zations for the infall component, such as sums of Gaussians, but we 
were unable to find a compact description as accurate as the skewed 
£-distribution, so we obtained poor results in modeling ^ s cg . 

Using the best-fit GIK models, we take a closer look into the 
properties of P(v r ,v t ) at different radii in Fig. [3] In each panel, 
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Figure 5. Functional fits (black curves) to the radial profiles of each GIK parameter measured in simulation (blue circles). Similar to Fig. [4] but only for the 
fiducial mass bin (M c = 1.259 — 1.585 x 1O 14 M0). In each panel, the gray shaded area below the radius of maximum infall indicates the regime where 
cut-offs are required to match the mixing of virial and infall components as shown in the left panel of Fig. [2 Dashed vertical lines indicate the turn-around 
radius, where the Hubble flow (gray solid line in the top left panel) is equal to the characteristic infall velocity and the skewness parameter a crosses zero in 
the bottom left panel. 



the gray thick curve shows the ID marginal probability distribu- 
tion of tangential (P(v t ), top) or radial (P(v r ), bottom) velocities. 
In the left two columns where r < r s h, the ID marginal prob- 
ability distributions of vt (top) and v r (bottom) are decomposed 
into virialized (red dashed) and infall (blue dotted) components. 
For the innermost bin, the infall component has a much broader 
spread in radial velocities than in tangential velocities, signaling its 
non- virial origin (also see Fig. [2}. The second bin shows a more 
prominent infall component with much smaller radial dispersion in 
the mixture (bottom). In the right two columns where only infall 
happens, we show the conditional probability distribution of v t at 
three fixed values of v r in the top panels, and vice versa in the 
bottom panels. For the r = 4.0 — 4.4 h~ 1 Mpc bin, as is also ap- 



parent in Fig. [2 the conditional distributions show little skewness. 
However, the conditional probability of v t shows higher kurtosis 
at more probable values of v r . In other words, if we divide the in- 
fall population at fixed r into streams of different v r , the dominant 
streams have a more sharply peaked tangential velocity distribu- 
tion P(vt\v r ). For instance in the 3rd column, the blue curve in the 
top panel shows P(v t \v r ) for the stream of v r — — 480 kms -1 , 
which is near the peak of P(v r ) according to the bottom panel. 
Compared to the other two streams of zero (green) and positive ra- 
dial velocity (red), the blue distribution has many fewer galaxies 
with \v t \ > 500 kms -1 . Similarly, the conditional probability of 
v r also shows higher kurtosis when v t is closer to zero (e.g., green 
curve in the bottom panel). This kurtosis relation between v t and 
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v r is also apparent in the r = 20.0 — 20.4 /i _1 Mpc bin, and it is 
ubiquitous at other distances. In addition, extra skewness in P(v r ) 
develops at other distances, and the degree of skewness correlates 
with vt, as seen in the bottom right panel. The skewness switches 
sign at scales below the turn-around radius, where the radial ve- 
locity distributions are negatively skewed (second column, bottom 
panel, blue dotted curve). The GIK model provides a faithful de- 
scription of the skewness and kurtosis measured in the simulations, 
but to avoid clutter, we do not show the simulation measurements 
in Fig. [3] — the goodness of fit will be ultimately tested in the mod- 
eling of f(y\ os \r p ,y). 

We fit the GIK model with seven parameters to the measure- 
ments of P(v r ,v t ) at radial bins from to 40 /i _1 Mpc for six 
bins of clusters in the simulation. The results are shown in Fig. [4j 
and we will discuss each panel in turn. The top left panel shows 
the profiles of the characteristic infall velocity v r , c - At given mass, 
the absolute value of v r , c becomes larger with decreasing distance, 
peaking at some characteristic radius of maximum infall r m i- Be- 
low r m i, as we see in the left column of Fig.|2] the infall component 
blends into the virialized population, reducing v r , c sharply to zero. 
The amplitude of v r , c scales with mass as approximately M 1//3 , 
therefore providing a clear diagnostic of cluster masses. The top 
middle and right panels show the profiles of virialized fraction / v ir 
and dispersion of the virial velocities cr v i r , respectively. The cut- 
offs below ~ 1 h~ 1 Mpc are caused by the same blending effect 
below r m i, where / v i r stays approximately constant at 0.65. In this 
regime, the seperation between the virialized and infall components 
is no longer sharp (see Fig. [3 left), so while our fit to P(v r , v t ) is 
accurate, the physical significance of individual parameters is less 
clear. There are plateaus in the cr v i r profiles at r~2 — 4/i _1 Mpc 
depending on mass, possibly indicating the pre-heating induced 
by shear flow at the surface where infall and virialized component 
first contact; however, the virialized component is only ~ 10% of 
the total at these radii. The amplitude of cr v ir profiles and the ex- 
tents of both cr v ir and /vir profiles scale with mass. Returning to 
the infall component, cluster masses affect both the amplitude and 
shape of the cr ra d (middle left) and a tan (middle right) profiles. 
More massive clusters induce infall streams with larger (smaller) 
cr ra d on smaller (larger) scales, while a tan increases monotoni- 
cally with cluster mass on all scales r ^ 20/i -1 Mpc. Finally, 
there is little variation of the profiles of a and dof with mass. 
The a profiles cross zero at different distances depending on the 
v r , c profile of each mass bin, but since a decreases slowly with 
distance at fixed mass, the amplitudes of different curves do not 
change much. The degrees of freedom start from extremely high 
values (near-Gaussian tails) at the cluster center, reach a minimum 
at ~ 5/i _1 Mpc (a Lorentz distribution has dof = 1), then rise 
up again on large scales. Despite the insensitivity to cluster mass, 
the systematic variation of a and dof with radius, not otherwise 
captured by simple Gaussian or exponential streaming models, is 
pivotal to the accurate modeling of f(v\ os \r p ,y) and £* g at rele- 
vant scales. The interdependence of radial and tangential velocities, 
seen in Fig. [3] are aslo required for accurate modeling. The cutoffs 
of crtan and a profiles on small scales have the same blending ori- 
gin as those of v r , c and / V i r . 

To summarize, Fig. HI shows that higher mass clusters have, as 
expected, higher amplitude characteristic infall curves v r , c , viri- 
alized components with higher velocity dispersion a v - ir that ex- 
tend to large radii r s h, and higher tangential velocity dispersions 
ctan within the infall component. The radial dispersion cr ra d has 
weaker mass dependence that reverses sign at r~5/i _1 Mpc, and 
the profiles of a(r) and dof (r), which control the shape of the 



distribution function of the infall component, show nearly univer- 
sal, mass-independent bahavior. We are especially interested in ex- 
tracting the characteristic infall curves v r , c (r), as these probe the 
extended mass profile around clusters and should be insensitive to 
physics that may alter the dispersions of satellite galaxies within 
halos. The smooth behaviour and systematic trends in Fig. |4] sug- 
gest that isolating v r , c will be feasible, and the profiles of a v - ir (r) 
and cr t an (r) offer additional mass diagnostics. The main area of un- 
certainty is the cutoff behaviour below the maximum infall radius 
r m i « 1 h~ 1 Mpc. This should have little impact i n practical ap- 
plicat ions, as shot noise and "fiber collision" effects felanton et al.l 
120031) make £*g difficult to measure at small r p , and £*g at a given 
r p is absolutely unaffected by any scales r < r p . 

To describe the variations of these seven parameters with r, 
we fit them with analytic functions chosen to match the numerically 
measured shapes. We fit v r , c (r) with a 3-parameter function, 



v r ,c(r) = qi 



Pi 



(r + r mi )^i : 



(7) 



where r m i is the maximum infall radius beyond which v r , c is ex- 
ponentially cut off to zero, q± controls the maximum amplitude, 
and pi and f3i together control the asymptote and slope of the large 
scale power-law behaviour, respectively. The virialized fractions 
are well described by a "powered exponential", 



Mr) 



expS-(- 



(8) 



where ro is a characteristic radius of the cluster. We set the radius 
at which f v i T (r) falls to 0.3% as the shock radius r s h, which cor- 
responds to 1.8 ro, and we set fvir(r > r s h) = 0. The velocity 
dispersion profiles of the virialized component can be described by 



(Jo 



{'♦K)T 



(9) 



where we find v — 1.315 provides a good fit to all the mass bins, 
and we impose the constraint that the plateau happens at r — r s h, 
so that r* is determined by the best-fit r s h from / v ir via r* = 
2 _1//l V s h = 0.59 r s h. The remaining four nuisance profiles are 
well described by one generic function, 



{Crad, CT ta 



dof} (r) = qi— pi 



(r + n)h 



e {2,3,4,5}, 
(10) 

where the effects of qi, pi, and /3i are similar to that of qo, po, and 
po in Equation [7] except that the minima happen at n/(/3i — 1) 
instead of 

The best-fit functions are shown in Fig.|5]for the fiducial mass 
bin. The gray shaded area in each panel indicates the r < r m i 
regime, where we apply simple cut-offs (exponential or poly- 
nomial) to mimic the rough measurements from simulation. We 
emphasize again that the cut-offs do not affect the modeling of 
f(v\ os \r Pl y) and £* g at r p > r m i. The gray line in the top left 
panel indicates the amplitude of Hubble flow, crossing v r , c (r) at 
the turn-around radius, which is indicated in each panel by the ver- 
tical dashed line. 



3 MODELING OF £c g 

The GIK model describes average galaxy motions around clus- 
ters in 3D, but the modeling of ^ s cg requires predicting the ID 
galaxy motions projected along the LOS at any given (r p , y). Fig. [6] 
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Figure 6. LOS velocity distribution in nine cells of different projected (r p ) and LOS (y) separations in real space for the fiducial mass bin. Blue histograms and 
black curves show the measurements from the simulation and the prediction from our best-fit GIK model. The inset panel on top left of each panel indicates 
the relative position of each cell (red dot) relative to the cluster center (yellow quadrant). 



compares the LOS velocity distribution f(v\ os \r p , y) measured di- 
rectly from the simulation to that predicted from the best-fit GIK 
model (i.e., using the functional fits shown in Fig.O, at nine dif- 
ferent cells around clusters in the fiducial mass bin. The relative 
position of each cell around the cluster center is indicated by the 
red dot in the inset quadrant. Although the GIK model is calibrated 
using the same simulation that f(v\ os \r p , y) is measured from, the 
agreement we see in Fig. [6] in all panels is highly non-trivial — 
the 2D mixture model accurately recovers the varying degrees of 
skewness and kurtosis of LOS velocity distributions at different po- 
sitions of (r p , y), whereas a less comprehensive model (e.g., only 
fitting to the first and second moments, or treating tangential and ra- 
dial velocities independently) would fail. The agreement in Fig. [6l 



also reconfirms that our best-fit GIK model provides an excellent 
description of the galaxy kinematics in the Millennium simulation. 

Once we predict f(v\ os \r p , y), it is straightforward to predict 
£cg by convolving f(v\ os \r Pl y) with the real-space cluster-galaxy 
correlation function then compare to the measured directly 
from simulation. The convolution is, expressed in a form more ex- 
plicit than Equation [TJ 

£ g (r p , tv) + 1 = Ho \C cg (Jr 2 P + y 2 ) + ll x 

J — oo 

f(H Q (r v -y)\r p ,y)dy. (11) 

To make sure that the £* g comparison is unaffected by any inac- 
curacies in we use the directly measured from simulation 
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Figure 7. Comparison between the predicted and measured at nine different projected separations for the fiducial mass bin. Blue circles and red triangles 
with error bars show the simulation measurements with peculiar velocity turned on and off, respectively. Solid curves show the predictions from the best-fit 
GIK model when peculiar velocity is on, while dashed curves show the direction transformation from ^ g to ^ g when peculiar velocity is off. 



to convolve with f(v\ os \r p ,y). For measuring we count the 
numbers of galaxies around clusters in spherical shells of succes- 
sive radii, ranging from 10/i _1 kpc to 50/i _1 Mpc with logarith- 
mic intervals, average over all clusters in each bin, and normal- 
ize by the galaxy numbers expected in a randomly located shell of 
equal volume. We measure £*g m a similar way, counting galax- 
ies in cylindrical rings of successive tv for each r p (assuming a 
distant-observer approximation so that the LOS is an axis of the 
box). 

Fig. [7] compares the £* g from convolution to the simulation 
measurements for the fiducial mass bin from r p = 0.7/i _1 Mpc 
to 25.5 h~ 1 Mpc (r p increasing from top to bottom, left to right). 



In each panel, blue circles with errorbars are the £*g measured di- 
rectly from the simulation, while the solid curves are predictions 
from the best-fit GIK model using Equation ITT1 red triangles with 
errorbars are the £*g measured when the peculiar velocity of each 
object is set to zero in the simulation, i.e., setting ^(rp,?^) = 
£cg {\/ r p + r i) • There is overall good agreement between £*g pre- 
dicted from the best-fit GIK model and measured from simulation, 
except for the r p = 4.2 /i _1 Mpc and 6.7/i _1 Mpc panels where 
the model predictions are slightly higher than the measurements at 
small TV . The minor discrepancy probably comes from the stochas- 
tic noise in measurements along one particular sight line (similar to 
cosmic variance), as there is already discrepancy for the no-v pec 
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Figure 8. Top panels: Powered exponential fits (solid curves) to the measured (top left) and predicted (top right) profiles of £,cg/ w p at different r p for the 
fiducial mass bin. Bottom panels: Best-fit characteristic LOS distance 7V )C (bottom left) and shape parameter (3 (bottom right) from the top panels; Blue circles 
and solid curves show the best-fits to simulation measurements and predictions from the GIK model, respectively. The dashed curve in the bottom right panel 
shows the best-fit r^c for the measured when peculiar velocity is turned off. 



cases (red triangles vs. dashed curves) in the same two panels be- 
fore convolving with f(vi oa \r p ,y) — the redshift space correla- 
tions are measured along the z-axis of simulation box, while the 
real space correlation is measured assuming isotropy of the entire 
box. 

By comparing the two cases (v pec vs. no-v pec ) within each 
panel, Fig. [7] also nicely illustrates the two major RSD effects — 
along the LOS is suppressed by random dispersion for small 
r p , but is amplified by galaxy infall for large r p . To further quantify 
these deformations, we fit a powered exponential function to the 
£cg at each r p , which is equivalent to fitting a normalized powered 
exponential function to S 



E(7v|r p ) 



w p (r p ) 

















1 


,r exp< 




rn,c 





(12) 



where 7v,c is the characteristic length scale at which £* g drops to 
1/e of its maximum value at tv = 0. The shape parameter f3 yields 
a Gaussian cutoff for /3 — 2 and simple exponential for ft — 1, 
though any value is allowed in the fit. 

Fig. [8] summarizes the fits of S measured from simula- 



tion (crosses in the top left panel) and predicted by the best-fit 
GIK model (plus symbols in the top right panel) at four differ- 
ent r p . The fits (solid curves) in the two panels are very similar, 
so we focus on the top left panel here. The open circle through 
each curve indicates the position of rv^, which migrates from 
~ 6.5/i _1 Mpc at r p = 0.7 /i _1 Mpc inward to - 5.0/i _1 Mpc 
at r p = 2.7/i _1 Mpc, and then outward to ^ 8 and 10/i _1 Mpc 
at r p = 6.7 and 10.4 /i _1 Mpc, respectively. This "precession" of 
r-K,c(r p ) is more clearly shown in the bottom left panel, where the 
blue circles and solid curves plot the migration of rv^ as func- 
tion of r p from fits to the measurements and model predictions, 
respectively. This characteristic U-shaped curve is the same one 
shown in the right panel of Fig. Q] Also shown in the bottom left 
panel is r 7r , c (r p ) for the no-v pec case, which grows monotonically 
with increasing r p . This can be easily understood for a power-law 
Ccg oc r' 1 , where rv) oc (l + (r 7r /r p ) 2 )" 7/2 withnov pe c, 

hence rv^ oc r p . The change of slope around r p c± 2.5 h~ 1 Mpc 
for the no-^pec case is caused by the change of 7 during transition 
from the 1-halo to 2-halo regime. The bottom right panel shows 
the corresponding changes of (3 as function of r p , which largely 
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Figure 9. Effects of perturbing each GIK component on the characteristic U-shaped curve of . In each row, the amplitude of one of the components (from 
top to bottom: fraction and velocity dispersion of the virialized component f V i r and a v i r , radial velocity profile v ra d, radial velocity dispersion (J ra d, and 
tangential velocity dispersion atan) is changed from its fiducial value by ±50%, and we fit powered exponential functions to the predicted at each r p . 
The right panels show the impact of these changes on the r 7r ,c(r p ) curves, comparing the predictions of the modified model (blue and red) to the fiducial 
model (black). 



follow (in a slightly lagged fashion) the variations in r 7r ,c(r p ), and 
which are again well described by the GIK model. We will focus 
on r 7r , c (r p ) as the representative feature of £* g m me next section. 

Note that the powered-exponential is not intended to be a 
viable model of H, just a compact and visually appealing way 
of quantifying the characteristics of S. Therefore, although the 
powered-exponentials do not fit well on scales larger than ~ 
15/i _1 Mpc, the best-fit r-^c and f3 still effectively capture the 
main features of each curve. 



4 BAYESIAN INFERENCE OF VELOCITY 
DISTRIBUTION 

Armed with an accurate GIK model that correctly predicts the £*g 
signal, we are in the position to investigate the origin of the "U- 
shaped curve" of Fig. [T] and, more importantly, to examine the 
intrinsic degeneracies within the modeling of ^» which carries 
valuable information that we hope to exploit robustly. To achieve 
this understanding, we perturb the elements of the velocity field 
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Figure 10. Constraints on each component of the GIK model from the simulation measurements of £* g . Yellow and gray contours indicate the 68% and 95% 
confidence regions. The blue star in the top left panel and dashed curves in the remaining panels show the direct measurements from the simulation. 



one at a time, changing the amplitude of a single model component 
by ±50% while holding others fixed. In 32 we found that among 
the seven parameters in the GIK model (see Fig. 0]), a and dof 
are insensitive to cluster mass, so we focus on the remaining five 
parameters, linking cr v ir and / V ir so that there are four independent 
GIK components. Each row in Fig.[9]shows the results of perturbing 
the amplitude of one component: properties of the virialized pop- 
ulation (Jvir+ivir, characteristic infall velocity v r , c , radial velocity 
dispersion of the infall population cr ra d, or tangential velocity dis- 
persion of the infall population cr ta n. For the first row, we change 
the amplitude of cr v ir by changing the value of ao in Equation [9j 
and we simultaneously change the value of r s h proportionally with 
do, which in turn expands / v i r along the r axis (the inset axis). For 
the rest, since they all have similar parameterization, we change the 
amplitudes via multiplying pi and qi in Equation [7] and [TO] by the 
same factor. We will describe each row in turn, from top to bottom. 

• When cr v ir is increased and / v i r is expanded, the virialized 



region becomes hotter and larger, producing more flattened £* g for 
r p < r s h while having no effect for r p ^ r s h- This is the portion 
of the FOG effect caused by virialized dispersion. 

• When the infall velocity v r , c is faster (more negative), galaxies 
at large r p are shifted closer to the cluster center (more "Kaiser 
compression"), therefore reducing 7v jC . However, at small r p the 
infall is strong enough to send galaxies from one side of the cluster 
in real space (y < or y > 0) to the opposite side in redshift 
space (tv > or tv < 0). This is the portion of the FOG effect 
caused by infall. The characteristic projected separation r* at which 
the U-shaped curve reaches a minimum is then set by the transition 
from large-scale compression to small-scale inversion, shifting to 
a larger scale when infall becomes stronger. The value of 7v, c at 
the minimum also decreases slightly as v r , c increases. 

• When cr r ad is higher, the velocity ellipses of P(v r ,v t ) are 
more elongated along the radial velocity axis. Since the ellipses sit 
mostly at the negative half of the radial velocity axis, higher cr ra d 



© 0000 RAS, MNRAS 000, 000-000 



14 Zu et al. 



effectively leads to overall stronger infall. However, at smaller LOS 
distance (y < r p ), the LOS velocity distribution is insensitive to the 
changes in cr ra d (0 < tt/4 in Equation[3, so the stronger infall only 
starts to affect rv^ at large scales where r njC is comparable to r p , 
and the impact on 7V )C increases as a function of r p . 

• When crtan is higher, the velocity ellipses of P(v r ,vt) are 
more stretched along the tangential velocity axis, effectively in- 
creasing the dispersions of v\ os without modifying the mean. There- 
fore, similar to the effect of cr v i r on small scales, higher cr ta n in- 
creases 7V, c on all scales, though the fractional impact is largest at 
?V > r sh . 

The basic U- shape of the r n , c vs. r p curve is straightforward to un- 
derstand: FOG stretching at small r p gives way to Kaiser compres- 
sion at intermediate r p which gives way to Hubble flow expansion 
at large r p Q However, Fig. [9] shows that the detailed shape of this 
curve, and more generally of Q g {r P: tv), reflects a complex inter- 
play among the four components of the galaxy kinematics around 
clusters. 

Crucially, the impact of each GIK component has a distinct 
scale and amplitude dependence, suggesting that they can be in- 
ferred from measurements with only limited degeneracy. To 
confirm this expectation, we construct a Gaussian likelihood model 
with the measurements of £* g in Fig. [7] as the input data, and the 
functional parameters introduced in 33 as our model parameters. 
The parameters we vary in the model are, {gi, pi, f3i} for the char- 
acteristic infall velocity (Equation [7]), {ro, (Jo} for the virialized 
component (Equations [9] and [8]), {g2, pi, ri, #2} for the radial ve- 
locity dispersion (EquationQI)]), and {#3, P3, 7*3, A} for the tangen- 
tial velocity dispersion (Equation [TO]). In this way, we allow max- 
imum freedom for the shape and amplitude of each component to 
vary during the fit. We keep the remaining parameters fixed to their 
best-fit values in Fig. \5\ as they appear to be insensitive to cluster 
masses (see Fig. 0} — we keep the radial profiles of a and dof 
fixed in amplitude and shape, and we maintain the shapes of the 
cr v ir and /vir profiles, while allowing them to change scale along 
both axes via ro and ao. The Gaussian log-likelihood is thus 

\n£(e cg \0) « -\ (£ 9 - CMf C- 1 (& g - , (13) 

where 

= {gi, Pi, ^i,r ,o-o, 52, P2,r 2 ,^2,g3,P3, 7-3,^3}, (14) 

and C is the data covariance matrix measured from Jackknife re- 
sampling of the simulation volume0 For demonstration purposes, 
we use the direct simulation measurements of ^ r cg to convolve with 
the predicted f(v\ os \r p , y). When applying the model to observa- 
tions, should either be inverted from the measured w p (see 
or directly predicted from theoretical models (papers II and III). 

We adopt a Bayesian approach, assuming uninformative pri- 
ors for all the parameters that are allowed to vary. For the parame- 
ter inference, we sample the posterior distributions using a Markov 
Chain Monte Carlo (MCMC), where an adaptive Metropolis step 
method is utilized during the burn-in period. The whole chain has 
15,000 iterations, 5,000 of which belong to the burn-in period, 
where auto-correlation tests show good convergence before actual 
sampling. We emphasize that while we used the r 7r , c (r p ) curve for 

5 Even at large r p , infall reduces tv.c relative to the real space value. 

6 We divide the simulation box into octants, and derive the covariance from 
the measurements of the whole box and eight subsamples, each com- 
posed of the whole box minus one octant. 



illustration in Fig. our statistical inference of GIK uses the full 
£,i g ( r Pi r 7r), as indicated by Equation [T3l 

Fig. [TOl presents the constraints on the four GIK components 
inferred from the MCMC when we apply this fitting procedure to 
the measurements from the fiducial mass bin. Yellow and gray 
contours show the 68% and 95% confidence limits, respectively. 
For the constraints on v r , c , ow, and a tan profiles, we compute 
the median (68% and 95%) curves discretely at each r as the me- 
dian (68% and 95%) functional values calculated from all the iter- 
ations at that r, rather than the functional values calculated from 
the median (68% and 95%) parameters. The blue star in the top 
left panel and blue dashed curves in the other panels indicate the 
direct fits to simulation measurements of GIK in Fig. \5\ The con- 
straints derived from show overall agreement with the direct 
fits, confirming that the distinctive effect of each GIK component 
allows us to constrain the overall model with minimal ambiguity. 
In particular, the characteristic radial infall curve v r , c (r) is inferred 
with small uncertainty from this cluster sample (iVduster = 691) 
. The medians are slightly offset from the fiducial values, possibly 
because of the same stochastic noise that causes the discrepancies 
in Fig. [7] but the offsets are smaller than the 68% statistical uncer- 
tainty. 



5 APPLICATION TO SDSS GROUPS 

We defer a comprehensive calibration and application of the GIK 
model to the future, but as a proof of concept, we apply the cur- 
rent GIK model to measured for rich ga laxy groups found in 
the SDSS. We employ the group catalog of I Yang et all d2007h in 
the local universe (z < 0.2) found in the spectros copic data of 
SDSS Data Release 7 fDR7: lAbazaiian et alj[2009h . Each group 
was identified initially with a friends-of-friends scheme in the red- 
shift space and kept in the catalog by an iterative adaptive fil- 
ter method. For more details on t he co nstruction of the catalog, 
we refer the reader to lYang et all J2007h . Small groups are likely 
to suffer more contamination and to have weaker infall patterns. 
We therefore select 6691 groups that have relatively high stel- 
lar mass (log M*/M© ^ 11.0) in their brightest cluster galax- 
ies (BCGs) and have at least three identified galaxy members; 
We describe these groups as our "cluster" sample. We divide 
the 6691 clusters further into three BCG stellar mass bins, with 
logM* = 11.00 - 11.20 (N c = 4018), logM* = 11.20- 
- 11.40 (N c = 2027), and log M* = 11.40 — 11.90 (iV c = 646), 
respectively. For the galaxy sample, we use the dr7 2saf eO sam- 
ple w ithin the NYU Value-Added Galaxy Catalog telanton etaD 
l2005h derived from the main spectroscopic sample in DR7, con- 
taining N g — 534206 galaxies with K+E-corrected r-band magni- 
tudes between —24 and — 16 at z < 0.2. 

In contrast to the simulation where the expected number of 
cluster-random galaxy pairs is known precisely, we have to con- 
struct random catalogs of galaxy samples that have the same angu- 
lar and redshift selection functions as the observed galaxy sample. 
We generate the angular completeness map in the format of non- 
overlapping po lygons from the dr7 2 s a f e window function us- 
ing MANGLE dSwanson et al.l 2008), then draw N g random galaxy 
coordinates from each polygon based on its spectroscopic com- 
pleteness. To account for the redshift selection function, we ran- 
domly shuffle the redshifts of observed galaxies and assign them 
to the random coordinates. This shuffling procedure is equivalent 
to drawing redshifts from the parent redshift distribution of the ob- 
served sample when N g is large. To reduce the random noise, we 
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Figure 11. Measurements of £* g for SDSS groups in three bins of BCG stellar mass. The yellow U-shaped curve in each panel shows the characteristic LOS 
distance r^c at each projected separation r p . The contour scales are the same for all panels, indicated by the colour bars on top. 



repeat the process for ten times and construct a random galaxy sam- 



ple with iV, 



10 x N a 



We measure ^ g using the lDavis & Peeblesl dl983h estimator, 



N r NcG(rp,r„) 
N g NcRirp.r^) 



1, 



(15) 



where Ncg and Ncr are the cluster-galaxy pairs and cluster- 
random galaxy pairs, respectively. We estimate measurement un- 
certainties on ^ s cg using the Jackknife re-sampling methodQ In ad- 
dition, we also measure the projected cluster-galaxy correlations 
w p from the same data and random galaxy samples using an inte- 
gration length of r™ ax = 40 h~ 1 Mpc. 

Fig. Qj] shows the ^ g measured for the three bins of clus- 
ters and the characteristic U-shaped curves derived from £* g . The 
colour scales are the same in all panels, so it is clear that clusters 
in higher stellar mass bins have on average higher masses, show- 
ing stronger correlation signals at fixed (r p ,7v). The BCG stel- 
lar mass, however, is only a loose indicator of total cluster mass. 
To completely model £*g measured in bins of BCG stellar mass, 
we should predict ^ r cg and f(v\ os \r P: y) as functions of M to get 
£ s cg (M), then convolve £ s cg (M) with scatter in the M*-M re- 
lation weighted by the cluster mass function dn/dM. This type 
of comprehensi ve model has been u sed to interpret weak lensing 
me asurements ( Sheldon et all 12009 ) for SP SS MaxBCG clusters 
by iRozo et all bOlOh . iTinker et all (120121) . and IZu et all (120121) . 
and we will adopt it for £* g modeling in future work. For our 
firs t-cut analy s is her e, we will infer average GIK properties of 
the lYang et al.l d2007h groups by treating each M*-bin as though 
it were a single halo mass bin. 

The first step is to reconstruct ^ r cg fr om the measure ments 
of w p . We adopt the ^ r cg presc ription from Zu et al.l j2012l) . This 
prescription, similar to that of lHavashi & White! (120081) . sets the 



7 Since the number of clusters in each bin is small compared to galaxies, 
we construct Jackknife sub-samples by dropping individual isolated clus- 
ters or groups of angularly close clusters from each bin. The number of such 
sub-samples for each bin is > 400. 



White profile dNavarro et al.ll 19961 . 1 1991 NFW) and a biased non- 
linear matter correlation function. We refer to these two compo- 
nents as "1-h" (for one-halo) and "2-h" (for two-halo), respec- 
tively. To ensure a reasonable behaviour of ^ r cg on large scales, we 
fix the non-linear matter correlation to be that from a flat ACDM 
(Q m = 0.23, as = 0.79) universe a t z — 0.1, computed from the 
HALOFIT model dSmith et al.ll2003l) . Therefore, the £ g model has 
three parameters, including cluster bias, scale radius, and normal- 
ization of the NFW profile. For more details about the mod eling 
and calibration of we refer the readers to lZu et al.l (120121) . We 
compute w p by integrating ££g using the same r™ ax = 40 /i _1 Mpc 
an d fit it to the mea sure d w v for each c luster bin. As pointed out 
bv lCroftetal1(ll999l) and lLi et all (1201 2l) . although this reconstruc- 
tion does not necessarily recover the correct 1-h and 2-h terms, it 
provides a reasonable estimate of the underlying ^ r cg . 

Fig. [T2l shows the results of this reconstruction. The left panel 
compares the measured w p to the best-fits using the 3-parameter 
model, showing good agreement for the two highest stellar mass 
bins, with some discrepancies on large scales for the lowest bin. 
These discrepancies may be caused by deficiency of our simplified 
w p model, or by contamination from interlopers in the group cat- 
alog. To avoid further complications, we drop the lowest bin from 
our following analysis. The middle panel shows the best-fit ^ r cg 
for each bin, and the corresponding 1-h and 2-h terms. The curves 
for the two higher stellar mass bins will be the input ££g f° r our 
Bayesian inferences below. In the right panel, we put together the 
three U-shaped curves shown individually in Fig.Qj] They are qual- 
itatively similar to the curves we see in simulation, and the varia- 
tion with BCG stellar mass is as expected — higher mass bins have 
stronger infall and larger dispersion of virial motions, thus smaller 
r^c on large scales but larger rv jC on small scales. 

Given the best-fit ££g reconstructed from w v , we apply the 
same Bayesian inference described in $4] t0 me measurements of 
£,c g ( r Pi r 7r) for each of the two higher stellar mass bins of SDSS 
clusters. We vary the same set of parameters listed in Equation [T4l 
All the input data points of £* g are shown in Fig. [14] (discussed 
further below) as red triangles and blue circles with error bars. 
Fig. \\3\ presents the constraints on the average GIK for the two 
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Figure 12. Projected correlations w p (left), 3D real space correlations ^ g (middle), and characteristic U-shaped curves (right) for SDSS groups of different 
BCG stellar mass. Left: Circles with error bars show the measurements of w p from SDSS and solid curves are the best-fit models. Middle: Solid curves show 
the t;c g inverted from best-fit w p in the left panel. Dashed and dotted curves show the contributions from the 1-h and 2-h terms in the model. Right: Same as 
the yellow curves shown in Fig, fill 



bins. The overall results are remarkably similar to what we see in 
the simulation (compare to Fig. [10]>, but with some anomalies that 
likely arise from observational uncertainties. In particular, the two 
inferred values of ro (radius at which f v - ir = 1/e) are considerably 
larger than expected for clusters with M ~ 10 14 h~ 1 M Q (e.g., 
ro < 3/i _1 Mpc for all the mass bins in Fig. |4}. This difference 
is likely caused by a combination of mis-centering, contamina- 
tion, and scatter between BCG stellar mass and cluster mass, all 
of which blur the measurements on small scales, mimicking a 
much stronger virial component. However, even though we do not 
model these systematic effects, we find trends of each GIK com- 
ponent with stellar mass that track the trends with cluster mass 
seen in the simulation (Fig. [4]>. In particular, the high stellar mass 
bin has a higher amplitude of virial dispersion <ro, stronger infall 
v r ,c, smaller dispersion in radial velocities cr ra d, and higher (com- 
parable) dispersion in tangential velocities a tan on small (large) 
scales. The dashed curve in the upper right panel of Fig. [13] re- 
peats the black points in Fig. [4^, mark the infall curve v r , c (r) 
measured for 1.0 - 1.259 x 10 14 /i _1 M© halos in the Millen- 
nium simulation. The agreement with our SDSS measurement for 
the higher stellar mass bin is very good, indicating that the mass 
scale of these BCG log M*/M© = 11.4-11.9 groups is about 
1Q 14 h~ 1 My , while that of the log M*/M = 11.2-11.4 groups 
is lower. jLi et al.ll2012h obtained a similar mass estimate for the 
log M*/M© = 11.4 — 11.9 groups using the internal satellite 
kinematics (see their fig. 10). 

Fig.[T4lcompares the £* g measured from the SDSS to that pre- 
dicted from our best-fit models at ten different r p for the two stellar 
mass bins. The model provides a good overall fit to the measure- 
ments from r p — 0.8 h~ 1 Mpc to 13 /i _1 Mpc in the perpendicular 
direction, and from tv = 0.3 /i _1 Mpc to tv = 26 h~ 1 Mpc in the 
LOS direction for each r v . 



6 CONCLUSION 

We have developed a methodology for modeling the redshift-space 
cluster-galaxy cross-correlation function £cp(r p , tv), calibrating 
and testing it with halo and galaxy catalogs from the Millennium 
simulation and presenting a first-cut observational application to 
galaxy groups in the SDSS redshift survey. The crucial input to this 



modeling is the line-of-sight velocity distribution f(v\ os \r p ,y), 
which we derive from a more complete description of the galaxy 
velocity distribution P(v r , v t ) that we refer to as the GIK (galaxy 
infall kinematics) model. Our GIK model is a 2D mixture of one 
virialized component (Gaussian, with zeros means and equal dis- 
persions of radial and tangential velocities) and one infall compo- 
nent (bivariate t-distribution skewed along the radial velocity axis). 
This 2D mixture correctly accounts for the higher moments of the 
velocity distributions (skewness and kurtosis) and the internal cor- 
relation between radial and tangential velocities, providing an ex- 
cellent fit to the galaxy kinematics around simulated clusters from 
the inner 1 /i _1 Mpc to beyond 40 /i _1 Mpc. After convolution with 
the real-space cluster-galaxy correlation function, the GIK model 
accurately reproduces the redshift-space cluster-galaxy correlation 
function £* g measured in the simulation. 

The features of £* g , which we summarized by the characteris- 
tic U-shaped curve r 7r , c (r p ), are shaped by the complex interplay 
among the four distinct elements of the GIK model: the virialized 
velocity sphere, the characteristic radial infall velocity, and the ra- 
dial and tangential velocity dispersions of the infall component. 
However, each of these elements affects r 7r , c (r p ) differently, and 
using the Millennium mock data we have demonstrated that the £*g 
measurement alone is sufficient to allow reconstruction of the un- 
derlying GIK around clusters. We are especially interested in the 
characteristic infall curve v r , c (r), as we expect it to provide a diag- 
nostic of extended cluster mass profiles that is insensitive to galaxy 
formation physics that might affect velocity dispersions within ha- 
los. As a proof of concept, we measure f° r SDSS groups and 
apply our modeling to infer the GIK for two bins of BCG stellar 
mass. The four GIK components show the trends expected if total 
halo mass correlates with BCG stellar mass, and the infall curve 
v r ,c(r) for the higher mass bin is in excellent agreement with the 
Millennium simulation prediction for 10 14 /i _1 M© halos. 

In principle, the full galaxy pairwise velocity distribution (as 
function of galaxy properties), probed by the redshift-space galaxy 
auto-correlation function contains more information than 
availa ble in £* n . Current theoretical efforts, both in config uration 
space dTinker et al.ll2006l:lTinkerll2007l:lReid & Whitdl2Ql lh and in 
Fourier spa ce (Seljak & McDonald 2011; Okumura et al. 2012a.b; 
IVlah et all 120121 : iGil-Marm et all Eoil IZhang et all l2012h . are 
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converging to percent-level accuracy of modeling £ gg on lin- 
ear scales (and a few percent on quasi-linear scales), but they 
are significantly worse on the non-linear scales w here mea- 
surements are the most precise JScoccimarrol |2004|) . We have 
shown that galaxy infall onto clusters is relatively straightfor- 
ward to model, thanks to the deep potential and high cluster- 
ing bias of cluster mass halos. While clusters are rare compared 
to galaxies, increasing measurement shot noise relative to £* g , 
the high bias of clusters boosts the signal, so the loss of sta- 
tistical power may be limited, and high mass halos are an in- 
tere sting population to isolate in any case. The eROSITA satel- 
lite jMerloni et al.l2012[) and deep optical imaging surveys fro m the 
Dark Energy Survey |The Dark Energy Collaboration] l2Q05h and 
LSST (ILSST Science Collaboration! l20Q9h should detect - 10 5 
clusters above a 10 14 M© threshold, so with an overlapping galaxy 



redshift survey the potential measurement precision for £ gg is very 
high. 

In future work we will investigate the sensitivity of GIK to 
galaxy formation physics, including the dependence on large scale 
spatial bias and velocity dispersion biases within halos. In observa- 
tional studies with sufficient statistics, one can test for systematics 
by checking that different galaxy samples (e.g., blue vs. red) lead 
to the same cosmological conclusions. The main observational sys- 
tematics are scatter between the cluster observable and mass and 
mis-centering of clusters (in both angular and redshift positions). 
The large radius of the virial component that we find for our SDSS 
groups is likely a consequence of mis-centering effects. Cosmo- 
logical analyses can incorporate scatter and mis-centering into the 
model predictions and marginalize over uncertainties in their de- 
scription. However, mis-centering effects must be calibrated for 
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Figure 14. Comparison between the predicted and measured at ten different projected separations for SDSS groups. Blue circles and red triangles with 
error bars show the measurements for the two BCG stellar mass bins (marked in the top left panel), respectively. Solid curves show the predictions from the 
best-fit GIK model. 



any given cluster finding algorithm and observational data set with 
detailed simulations. 

As a probe of dark energy and modified gravity, GIK modeling 
of galaxy clusters complements stacked weak lensing analysis in 
both observational requirements and information content. Stacked 
weak lensing relies on overlap between a cluster sample and a deep 
imaging survey; forecasts for Stage III and Stage IV dark energy 
experiments predict cluster weak lensing constraints that are com- 
petitive with those fr om supernovae, baryon acoustic oscillations, 
and cosmic shear (see Weinberg etalJ 2012. §6 and §8.4). GIK anal- 
ysis requires overlap with a large galaxy redshift survey, such as the 
SDSS survey used here, the o ngoing Baryon Acoustic Oscillation 
Spect roscopic Survey (BOSS; lEisenstein et al.l201ll ; lDawson et al.1 
120121) and its higher redshift successor eBOSS, and the deeper sur- 
veys planned for future facilitie s such as BigBOSS dSchlegel et al.l 
120091) . DESp ec dAbdalla et al] I20I2L the Subaru Prime F ocus 
Spectrogr aph fellis et alJl2012h . Euclid (lLaureiis et alJl201lh . and 
WFIRST dGreen et alJl2012h . A combinaton of stacked weak lens- 
ing and £* g analysis for the same cluster sample would yield 
tigher dark energy constraints than either method on its own. 
Comparison of the two provides an important consistency test for 
GR, as many modified gravity models predict a "slip" between 
the gravitational p otentials that govern weak lensing and non- 
relativistic tracers Jjain & Khourvl Eoiol and references therein). 
Compared to ACDM+GR, modified gravity models predict dis- 
tinctive signatures i n halo statistics (|Chan & Scoccimarrol |2009; 



Schmidt et al. 2009; Li & Barrow 201 1) and galaxy redshift-space 
distortions (I Jennings et alj|20l3) . Most interesting of all for our 
purposes, screening effects in modified gra vity models lead to dis- 
tinctive signatures in halo density profiles dLombriser et alj|2 012) 
and galaxy phase-space density profiles around clusters dLam et all 



120121) . Redshift— space cluster— galaxy cross— correlations may be an 
especially sensitive diagnostic of such effects, so they could allow 
stringent tests of these theories, or even yield smoking gun evidence 
that deviations from GR on cosmological scales drive the acceler- 
ating expansion of the universe. 
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